1. What is Computer Vision?
Computer Vision is a branch of Artificial Intelligence that enables
computers to detect, process, analyze and understand visual information
from images and videos.
🧠Remember:
Human → Understands visual scenes naturally
Computer Vision → Tries to teach machines to do the same
2. Goal of Computer Vision
Convert image/video data into meaningful information for decision making.
- Recognize objects
- Recognize people
- Understand scenes
- Interpret activities
Exam Keyword:
Image/Video → Information
3. Human Vision vs Computer Vision
| Human Vision | Computer Vision |
| Naturally interprets scenes | Requires algorithms and data |
| Handles ambiguity well | Can be fooled easily |
| Fast understanding | Needs computation |
🧠Human vision remains more robust than machine vision in many situations.
4. Fields Related to Images and Videos
| Field | Main Purpose |
| Computer Graphics | Create images |
| Image Processing | Manipulate images |
| Computer Vision | Understand images |
🧠Shortcut:
Graphics → Create
Processing → Manipulate (image in, image out)
Vision → Understand
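The three fields can be contrasted with a toy sketch. All function names here are hypothetical and the "vision" step is deliberately trivial; the point is only the direction of each mapping (description → image, image → image, image → information).

```python
# Toy illustration only; function names are made up for these notes.

def render_square(size=8, lo=2, hi=6):
    """Graphics: description -> image.
    Draw a bright square on a dark background."""
    return [[255 if lo <= r < hi and lo <= c < hi else 0
             for c in range(size)]
            for r in range(size)]

def invert(image):
    """Image processing: image -> image.
    Invert every pixel intensity."""
    return [[255 - p for p in row] for row in image]

def bright_fraction(image, threshold=128):
    """Computer vision (very loosely): image -> information.
    Report what fraction of the pixels is bright."""
    pixels = [p for row in image for p in row]
    return sum(p >= threshold for p in pixels) / len(pixels)

img = render_square()          # create
neg = invert(img)              # manipulate
print(bright_fraction(img))    # 0.25 (16 of 64 pixels are bright)
print(bright_fraction(neg))    # 0.75
```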
5. Three Main Areas of Computer Vision
- Measurement
- Perception and Interpretation
- Search and Organization
Measurement:
Recover information about the real 3D world from images.
Perception & Interpretation:
Recognize objects, people, activities and scenes.
Search & Organization:
Find and organize visual data efficiently.
6. Measurement
Computer vision estimates properties of the real 3D world from visual data.
Example: reconstructing a 3D model from multiple images of the same scene.
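One simple instance of measurement is recovering depth from two images. The sketch below assumes a rectified stereo pair (two identical cameras side by side), where depth follows Z = f · B / d. The function name and the numbers are assumptions for illustration, not part of these notes.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a point seen by a rectified stereo pair.

    focal_px     : focal length in pixels (assumed known from calibration)
    baseline_m   : distance between the two cameras in metres
    disparity_px : horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Made-up example values: a larger disparity means a closer point.
near = depth_from_disparity(700.0, 0.1, 70.0)
far = depth_from_disparity(700.0, 0.1, 35.0)
print(near, far)  # roughly 1.0 and 2.0 metres
```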
7. Perception and Interpretation
Examples: detecting faces, recognizing objects and understanding scenes.
🧠Think:
"What is in the image?"
8. Search and Organization
Visual data can be indexed, searched and categorized automatically.
Examples: Google Image Search and other image retrieval systems.
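A minimal sketch of image retrieval: describe each image by an intensity histogram and rank the stored images by distance to the query. This is an assumed toy method, not how any particular search engine works. The helper names and the tiny 2×2 "images" are made up.

```python
def histogram(image, bins=4):
    """Normalised intensity histogram of a grayscale image (values 0-255)."""
    counts = [0] * bins
    for row in image:
        for p in row:
            counts[min(p * bins // 256, bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]

def l1_distance(a, b):
    """Sum of absolute differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))

def search(query, database, bins=4):
    """Return database keys ordered from most to least similar."""
    q = histogram(query, bins)
    return sorted(
        database,
        key=lambda name: l1_distance(q, histogram(database[name], bins)),
    )

database = {
    "dark":   [[10, 20], [30, 40]],
    "bright": [[200, 220], [240, 250]],
    "mixed":  [[10, 250], [20, 240]],
}
query = [[5, 15], [25, 60]]
print(search(query, database))  # ['dark', 'mixed', 'bright']
```

In a larger system the histograms would be computed once and stored as an index, so only the query needs to be described at search time.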
9. Applications of Computer Vision
- Autonomous vehicles
- Face recognition
- Biometrics
- Optical Character Recognition (OCR)
- Medical imaging
- Industrial inspection
- Retail automation
- Satellite image analysis
- Space exploration
- Gaming and interaction systems
Exam Tip:
Be able to explain at least 3 real-world applications.
10. Biometrics and Recognition
| Technology | Purpose |
| Face Recognition | Identity verification |
| Fingerprint Recognition | User authentication |
| Iris Recognition | High-security identification |
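Biometric systems commonly compare a feature vector extracted from a new sample against stored templates. The sketch below assumes those vectors already exist (the numbers are invented) and shows verification (1:1) versus identification (1:N). The function names and the threshold are hypothetical.

```python
import math

def verify(probe, enrolled, threshold=0.25):
    """1:1 check: does the probe match this one enrolled template?"""
    return math.dist(probe, enrolled) <= threshold

def identify(probe, gallery, threshold=0.25):
    """1:N search: return the closest enrolled name, or None if nobody
    is close enough."""
    name, template = min(
        gallery.items(), key=lambda item: math.dist(probe, item[1])
    )
    return name if math.dist(probe, template) <= threshold else None

# Invented feature vectors; real systems derive them from the image.
gallery = {"alice": [0.1, 0.8, 0.3], "bob": [0.9, 0.1, 0.5]}
probe = [0.1, 0.9, 0.3]

print(verify(probe, gallery["alice"]))  # True
print(verify(probe, gallery["bob"]))    # False
print(identify(probe, gallery))         # alice
```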
11. Optical Character Recognition (OCR)
OCR converts images containing text into machine-readable text.
Examples: license plate recognition, digit recognition and scanned document conversion.
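The image → text idea can be sketched with a toy template matcher: each character is a tiny 3×5 bitmap, and a glyph is read as the template it differs from least. This is an assumed teaching example; practical OCR systems are far more elaborate. The templates and names below are made up.

```python
# Hypothetical 3x5 bitmaps for three digits ('1' = ink, '0' = background).
TEMPLATES = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "7": ["111", "001", "010", "010", "010"],
}

def hamming(a, b):
    """Number of pixels that differ between two same-sized bitmaps."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))

def recognize(glyph):
    """Return the character whose template is closest to the glyph."""
    return min(TEMPLATES, key=lambda ch: hamming(glyph, TEMPLATES[ch]))

def read_text(glyphs):
    """Convert a sequence of glyph bitmaps into a string."""
    return "".join(recognize(g) for g in glyphs)

noisy_seven = ["111", "001", "010", "010", "011"]  # one pixel flipped
print(read_text([TEMPLATES["1"], noisy_seven, TEMPLATES["0"]]))  # 170
```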
12. Why Computer Vision is Difficult
Computer vision is an ill-posed problem: an image alone does not determine the scene uniquely.
The real world is 3D, but images are only 2D projections, so depth information is lost.
🧠Remember:
Many different real-world situations can produce the same image.
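This ambiguity can be shown with a pinhole camera model, assumed here as the projection (x, y) = (f·X/Z, f·Y/Z). Two different 3D points on the same ray through the camera centre land on the same pixel. The function name and focal length are illustrative.

```python
def project(point, focal_length=100.0):
    """Project a 3D point (X, Y, Z) in camera coordinates onto the
    2D image plane of an ideal pinhole camera."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)

near = (1.0, 2.0, 4.0)   # a small offset, close to the camera
far = (2.0, 4.0, 8.0)    # twice the offset, twice as far away

print(project(near))  # (25.0, 50.0)
print(project(far))   # (25.0, 50.0)
```

From the image coordinates alone there is no way to tell `near` from `far`, which is why extra information (a second view, known object size, learned priors) is needed.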
13. Digital Images
A digital image is composed of pixels arranged in rows and columns.
🧠Pixel = Smallest unit of a digital image.
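A grayscale image can be sketched as a nested list, one inner list per row. The values below are made up; 0 is taken as black and 255 as white, which is a common 8-bit convention.

```python
# A tiny 2-row x 3-column grayscale image; each number is one pixel.
image = [
    [0, 50, 100],
    [150, 200, 255],
]

rows = len(image)       # number of rows (image height)
cols = len(image[0])    # number of columns (image width)
pixel = image[1][2]     # pixel at row 1, column 2 (zero-based)

print(rows, cols, pixel)  # 2 3 255
```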
14. Image Processing Tasks
- Pixel Manipulation
- Filtering
- Restoration
- Enhancement
- Edge Detection
- Segmentation
These are core low-level image processing operations.
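Three of these operations can be sketched on a tiny grayscale image. The versions below are deliberately simplified assumptions (a 3-tap row mean for filtering, neighbour differences for edges, a fixed threshold for segmentation). The function names are illustrative.

```python
def smooth_rows(image):
    """Filtering: replace each pixel by the integer mean of itself and
    its left/right neighbours (fewer neighbours at the borders)."""
    result = []
    for row in image:
        new_row = []
        for c in range(len(row)):
            window = row[max(c - 1, 0):c + 2]
            new_row.append(sum(window) // len(window))
        result.append(new_row)
    return result

def horizontal_edges(image):
    """Edge detection: absolute difference between horizontal neighbours."""
    return [[abs(row[c + 1] - row[c]) for c in range(len(row) - 1)]
            for row in image]

def segment(image, threshold=128):
    """Segmentation: label each pixel 1 (object) or 0 (background)."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]

image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
print(smooth_rows(image)[0])       # [10, 73, 136, 200]
print(horizontal_edges(image)[0])  # [0, 190, 0]
print(segment(image)[0])           # [0, 0, 1, 1]
```

Each function takes an image and returns an image (or an image-like map), which is what makes these low-level operations.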
15. Colour Image Processing
- Colour Spaces
- Colour Manipulation
- Colour Analysis
RGB is the most common colour representation.
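A short sketch of moving between colour representations, using Python's standard `colorsys` module for RGB → HSV and the widely used luma weights 0.299, 0.587, 0.114 for RGB → grayscale. The helper names are assumptions for illustration.

```python
import colorsys

def to_gray(rgb):
    """Colour manipulation: weighted sum of 8-bit R, G, B values."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def to_hsv(rgb):
    """Colour space change: 8-bit RGB -> HSV with all values in [0, 1]."""
    r, g, b = (channel / 255 for channel in rgb)
    return colorsys.rgb_to_hsv(r, g, b)

red = (255, 0, 0)
print(to_gray(red))  # 76
print(to_hsv(red))   # (0.0, 1.0, 1.0)
```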
16. Higher-Level Vision Tasks
- Representation
- Description
- Recognition
- Interpretation
- Semantic Understanding
Modern deep learning systems perform object detection and scene understanding.
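The step from description to recognition can be sketched without deep learning: describe a segmented region by a few numbers, then assign a label from that description. The rule-based labels below are an assumed toy stand-in for a trained recognizer. All names are hypothetical.

```python
def describe(mask):
    """Description: summarise a binary region by area and bounding box."""
    coords = [(r, c)
              for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    if not coords:
        raise ValueError("mask contains no object pixels")
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return {
        "area": len(coords),
        "height": max(rows) - min(rows) + 1,
        "width": max(cols) - min(cols) + 1,
    }

def recognize(desc):
    """Recognition: map the description to a label with simple rules."""
    if desc["width"] == desc["height"]:
        return "square-like"
    return "wide" if desc["width"] > desc["height"] else "tall"

mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
desc = describe(mask)
print(desc)             # {'area': 8, 'height': 2, 'width': 4}
print(recognize(desc))  # wide
```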
17. Emerging Applications
- Visual Captioning
- Visual Question Answering (VQA)
- Egocentric Vision
- Fashion Recommendation Systems
- Autonomous Systems
18. Final Exam Summary
Most Important Points
- Computer Vision: Image/Video → Information
- Image Processing: Image → Image
- Graphics: Create images
- Three Areas: Measurement, Perception and Interpretation, Search and Organization
- Challenge: 3D world projected to 2D images
- OCR: Image text → Digital text
- Applications: Face recognition, medical imaging, autonomous vehicles
- Digital Image: Collection of pixels